Gemma 3 12B IT Quantized W4A16
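Gemma 3 is a family of large language models developed by Google. This repository provides the 12B-parameter instruction-tuned (IT) model in W4A16 quantized form: weights are stored in 4 bits while activations stay in 16 bits. This significantly reduces the memory needed to hold the weights compared with an unquantized 16-bit checkpoint, while maintaining good performance.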
Large Language Model
Transformers
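
As a rough illustration of the memory reduction described above, the sketch below estimates the weight footprint at 16 bits and at 4 bits per weight. It assumes a nominal 12 billion parameters (taken from the model name, not an exact count) and a 16-bit baseline. It ignores quantization scales and zero-points, any layers kept at higher precision, the KV cache, and activations, so real usage will be somewhat higher. The function and variable names are illustrative only.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 2**30


# Assumption: nominal "12B" parameter count from the model name.
NUM_PARAMS = 12e9

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)  # assumed unquantized 16-bit baseline
w4_gib = weight_memory_gib(NUM_PARAMS, 4)     # 4-bit weights (the "W4" in W4A16)

print(f"16-bit weights: ~{fp16_gib:.1f} GiB")
print(f"4-bit weights:  ~{w4_gib:.1f} GiB")
print(f"reduction:      ~{fp16_gib / w4_gib:.0f}x (weights only)")
```

Under these assumptions the weights shrink by about a factor of four. Because activations remain in 16 bits, runtime memory for activations and the KV cache is not reduced by this scheme.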